Goto

Collaborating Authors

 tena cucala


Sound Logical Explanations for Mean Aggregation Graph Neural Networks

Morris, Matthew, Horrocks, Ian

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) are frequently used for knowledge graph completion. Their black-box nature has motivated work that uses sound logical rules to explain predictions and characterise their expressivity. However, despite the prevalence of GNNs that use mean as an aggregation function, explainability and expressivity results are lacking for them. We consider GNNs with mean aggregation and non-negative weights (MAGNNs), proving the precise class of monotonic rules that can be sound for them, as well as providing a restricted fragment of first-order logic to explain any MAGNN prediction. Our experiments show that restricting mean-aggregation GNNs to have non-negative weights yields comparable or improved performance on standard inductive benchmarks, that sound rules are obtained in practice, that insightful explanations can be generated in practice, and that the sound rules can expose issues in the trained models.


Logical Expressivity and Explanations for Monotonic GNNs with Scoring Functions

Morris, Matthew, Cucala, David J. Tena, Grau, Bernardo Cuenca

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) are often used for the task of link prediction: predicting missing binary facts in knowledge graphs (KGs). To address the lack of explainability of GNNs on KGs, recent works extract Datalog rules from GNNs with provable correspondence guarantees. The extracted rules can be used to explain the GNN's predictions; furthermore, they can help characterise the expressive power of various GNN models. However, these works address only a form of link prediction based on a restricted, low-expressivity graph encoding/decoding method. In this paper, we consider a more general and popular approach for link prediction where a scoring function is used to decode the GNN output into fact predictions. We show how GNNs and scoring functions can be adapted to be monotonic, use the monotonicity to extract sound rules for explaining predictions, and leverage existing results about the kind of rules that scoring functions can capture. We also define procedures for obtaining equivalent Datalog programs for certain classes of monotonic GNNs with scoring functions. Our experiments show that, on link prediction benchmarks, monotonic GNNs and scoring functions perform well in practice and yield many sound rules.


Relational Graph Convolutional Networks Do Not Learn Sound Rules

Morris, Matthew, Cucala, David J. Tena, Grau, Bernardo Cuenca, Horrocks, Ian

arXiv.org Artificial Intelligence

Graph neural networks (GNNs) are frequently used to predict missing facts in knowledge graphs (KGs). Motivated by the lack of explainability for the outputs of these models, recent work has aimed to explain their predictions using Datalog, a widely used logic-based formalism. However, such work has been restricted to certain subclasses of GNNs. In this paper, we consider one of the most popular GNN architectures for KGs, R-GCN, and we provide two methods to extract rules that explain its predictions and are sound, in the sense that each fact derived by the rules is also predicted by the GNN, for any input dataset. Furthermore, we provide a method that can verify that certain classes of Datalog rules are not sound for the R-GCN. In our experiments, we train R-GCNs on KG completion benchmarks, and we are able to verify that no Datalog rule is sound for these models, even though the models often obtain high to near-perfect accuracy. This raises some concerns about the ability of R-GCN models to generalise and about the explainability of their predictions. We further provide two variations to the training paradigm of R-GCN that encourage it to learn sound rules and find a trade-off between model accuracy and the number of learned sound rules.


On the Correspondence Between Monotonic Max-Sum GNNs and Datalog

Cucala, David Tena, Grau, Bernardo Cuenca, Motik, Boris, Kostylev, Egor V.

arXiv.org Artificial Intelligence

Although there has been significant interest in applying machine learning techniques to structured data, the expressivity (i.e., a description of what can be learned) of such techniques is still poorly understood. In this paper, we study data transformations based on graph neural networks (GNNs). First, we note that the choice of how a dataset is encoded into a numeric form processable by a GNN can obscure the characterisation of a model's expressivity, and we argue that a canonical encoding provides an appropriate basis. Second, we study the expressivity of monotonic max-sum GNNs, which cover a subclass of GNNs with max and sum aggregation functions. We show that, for each such GNN, one can compute a Datalog program such that applying the GNN to any dataset produces the same facts as a single round of application of the program's rules to the dataset. Monotonic max-sum GNNs can sum an unbounded number of feature vectors which can result in arbitrarily large feature values, whereas rule application requires only a bounded number of constants. Hence, our result shows that the unbounded summation of monotonic max-sum GNNs does not increase their expressive power. Third, we sharpen our result to the subclass of monotonic max GNNs, which use only the max aggregation function, and identify a corresponding class of Datalog programs.


Consequence-Based Reasoning for Description Logics with Disjunctions and Number Restrictions

Bate, Andrew, Motik, Boris, Cuenca Grau, Bernardo, Tena Cucala, David, Simančík, František, Horrocks, Ian

Journal of Artificial Intelligence Research

Classification of description logic (DL) ontologies is a key computational problem in modern data management applications, so considerable effort has been devoted to the development and optimisation of practical reasoning calculi. Consequence-based calculi combine ideas from hypertableau and resolution in a way that has proved very effective in practice. However, existing consequence-based calculi can handle either Horn DLs (which do not support disjunction) or DLs without number restrictions. In this paper, we overcome this important limitation and present the first consequence-based calculus for deciding concept subsumption in the DL ALCHIQ+. Our calculus runs in exponential time assuming unary coding of numbers, and on ELH ontologies it runs in polynomial time. The extension to disjunctions and number restrictions is technically involved: we capture the relevant consequences using first-order clauses, and our inference rules adapt paramodulation techniques from first-order theorem proving. By using a well-known preprocessing step, the calculus can also decide concept subsumptions in SRIQ---a rich DL that covers all features of OWL 2 DL apart from nominals and datatypes. We have implemented our calculus in a new reasoner called Sequoia. We present the architecture of our reasoner and discuss several novel and important implementation techniques such as clause indexing and redundancy elimination. Finally, we present the results of an extensive performance evaluation, which revealed Sequoia to be competitive with existing reasoners. Thus, the calculus and the techniques we present in this paper provide an important addition to the repertoire of practical implementation techniques for description logic reasoning.